Abstract

The purpose of this study was to determine if correlations among
student ratings items designed to be diagnostic could be lowered
through use of special instructions to raters. The authors argue
that the lowering of inter-item correlations is indicative of a
reduction of the halo effect, which leads to greater item
diagnosticity. The experimental group first ranked items in terms
of importance, then rated the course with the diagnostic items,
then rated the course with the general items. This order was
reversed for the control group. The correlations among items
were significantly lower for the experimental group.
of Standard Instructions

Richard W. Naccarato, Miriam X. Goldstein, and Gerald Gillmore



Students' evaluations of college courses continue to receive attention
from faculty and administration in institutions of higher learning. In a recent
paper on student-faculty evaluations, Permut speaks of the demand for
"accountability in higher education" and of the brighter spotlight being shone
upon student evaluations, not only by administrators and faculty, but by
students and governmental agencies (1974, p. 41). Whether student ratings are to
be used for administrative decision-making or instructional improvement, it is
desirable to reduce the effect of extraneous factors on the results. This study
centers on the "halo" effect that apparently exists in many rating situations,
and assesses the impact of a strategy to reduce this contaminating effect upon
the usefulness of student ratings for diagnosis of instructional problems.

Costin, Greenough, and Menges (1971) defined the halo effect as the tendency
of raters to respond similarly to all items on the basis of some set impression.
The origin or causes of these set impressions are relatively unknown; however,
most studies have attributed the impressions to various perceptual and
attitudinal processes within the individual. Widlak, McDaniel, and Feldhusen
(Note 3) performed a factor analysis of student ratings results in order to
assess existing halo effects. Using the Course-Instructor Evaluation (CIE) from
Purdue University, they correlated 18 evaluation items and concluded that the
halo effect was so strong in the CIE that the specific item ratings may have
little diagnostic value in assessing a teacher's strengths and weaknesses. In a
statistical analysis of data from the first year's use of the Instructional
Assessment System (IAS) at the University of Washington (involving the instrument
used in this study), Gillmore (Note 1) reported fairly high correlations among
items designed to be diagnostic in purpose. The correlations, computed with
classes as the unit, averaged about .70. Gillmore suggested that these high
correlations could indicate the presence of a strong halo effect and, importantly,
may limit the diagnostic value of the items. Gillmore cautioned that "...one
who does well in his teaching in one area [possibly] also tends to do well in
other areas, and vice versa. In other words, the halo may be, in fact, an
accurate perception." (21-22).

One very evident way in which rating results can be used to improve
instruction is for the instructor to concentrate on those items on which he is
rated low, and try to improve in the areas assessed by the items. In other
words, he can use items on which he is rated low as diagnostic of particular
problems. However, insofar as items are highly correlated across classes, the
lower rated items will not be indicative of particular problems. The
relatively low mean values will be a result of random error or be an artifact of
the intensity with which the item is worded. Thus, high inter-item correlations
restrict the diagnostic value of the instrument, whether the high correlations
accurately reflect reality or not.

Thus far, we have based our arguments, both for the existence of a halo
effect and for the consequent loss of item diagnosticity, on high inter-item
correlations across classes. However, halo effects are usually thought of as
emanating from an individual rather than a group. Clearly for student
instructional ratings, a halo effect must be operating within individuals in
order to be operating for classes. High inter-item correlations across
individuals within a class would seem to be necessary, if not sufficient,
evidence of the existence of a halo effect at the individual level.
Furthermore, to reduce correlations among item means across classes which are
caused by a halo effect, one must be able to reduce the inter-item correlations
within classes.
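
As an illustration of the index this argument relies on, the following is a
minimal sketch of how an average inter-item correlation within one class could
be computed from raw ratings. The function names and the small ratings matrix
are hypothetical and are not taken from the study; the sketch assumes complete
data and that every item shows some variance.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_inter_item_r(ratings):
    """Average Pearson r over all distinct pairs of items.

    ratings: list of rows, one row per student, one column per item.
    """
    items = list(zip(*ratings))  # transpose: one tuple of ratings per item
    pairs = list(combinations(items, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: 5 students x 3 items (not data from the study).
# Every student gives the same rating to all items, an extreme "halo" pattern.
halo = [[5, 5, 5], [4, 4, 4], [3, 3, 3], [2, 2, 2], [1, 1, 1]]
# Students discriminate among the items.
mixed = [[5, 1, 3], [4, 2, 5], [3, 3, 1], [2, 4, 4], [1, 5, 2]]

print(mean_inter_item_r(halo), mean_inter_item_r(mixed))
```

In this toy case the undifferentiated ratings yield a much higher average
correlation than the differentiated ones, which is the pattern the argument
above treats as evidence of halo.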

The purpose of the present study is to determine if the correlations among
the diagnostic items of the IAS can be reduced by altering the standard procedure
for administering the forms. Specifically, standard administrative procedure
was altered in two regards. First, IAS forms contain items within three sections,
with items within the initial section being designed to be global or general in
nature. Since students normally complete this section prior to continuing on
to the diagnostic items, the general items may produce a set to respond at a
given level throughout the instrument. This level would probably be based on
the student's overall judgment of the quality of the course and instructor.
Thus, our first strategy for reducing inter-item correlations was to have
experimental subjects respond to the diagnostic portion of the form prior to
responding to the general items.

Our second strategy was based on the notion that students possibly do not
take the time and effort to read and consider items carefully before responding,
and, hence, do not make careful discriminations based on item content. To
counteract this tendency, if it exists, we forced experimental students to
make fine discriminations among the diagnostic items by requesting that they
be ranked in terms of importance in assessing teaching effectiveness prior to
being used to rate the course. As a somewhat serendipitous result of this
strategy, we were also able to obtain data on the relative importance of the
various items as perceived by students.
Method

Subjects. Ninety-six students from an elementary economics course at the
University of Washington participated in the study. Four quiz sections were
randomly selected from the twenty sections comprising the course. Two of these
sections were randomly chosen from the four and combined into an experimental
group (N = 49). The remaining two sections were combined to form the control
group (N = 47). The separate quiz sections met twice a week, whereas the
entire group attended lectures three times per week. The evaluation instrument
was administered to the four quiz sections separately at their weekly meeting.

Instrument. IAS Form B (Gillmore, Note 2) was administered to experimental
and control groups (see Appendix A for complete form).

Procedure. Permission from the course instructor had been secured prior
to visiting the quiz sections, and the teaching assistants (TA's) for the
sections were aware that their section might be chosen that day for
participation in the experiment.

When the experimenter arrived at the classrooms, the TA left the room.
The tailored instructions (see Appendix C for complete instructions) were read
aloud to the two sections comprising the experimental group. These students
were instructed to bypass the demographic items and the four global items, to
rank order separately the remaining eighteen diagnostic instructor feedback
and course information items, and then to grid in their evaluative responses
to these diagnostic items before responding to the former items. Standard
instructions (see Appendix B for complete instructions) were given to the two
sections composing the control group, in which students responded to all
evaluation items in the order in which they occurred. Subsequently Ss were asked
to rank order the diagnostic instructor and course items as to their importance
in measuring teaching effectiveness. Both control and experimental groups were
told that they were rating the main instructor for the course and not the TA
for their section.

Results and Conclusions

The primary research hypothesis of this study concerned itself with the
reduction of the halo effect as evidenced by high inter-item correlations. For
our evaluative instrument the diagnostic items of interest are the instructor
feedback and student information items of Table 1. The resulting inter-item
correlations for these two sets of items, under the tailored instructions given
to the experimental group and the standard instructions given to the control
group, appear in Table 2. Italicized correlations represent those of the
experimental group.

Insert Tables 1 and 2 about here

The inter-item correlations among items 5 through 15 and 16 through 22
tend to be smaller for the experimental group than those of the control group
condition (Table 2). The average inter-item correlation (r̄) within both
instructor feedback and student information items for the experimental group
was .29, whereas for the control group r̄ equalled .43 for the instructor
feedback and .46 for the course information items. To test for pairwise
directional differences between experimental and control group correlations a
sign test (Winer, 1971) was performed on the pairs of correlations in Table 2.
Of the 55 pairs of correlation coefficients within the diagnostic instructor
feedback section, 33 of the experimental group correlations were less than those
of the control group, a difference significant at the .01 level. The result of
the sign test for the student information items showed 16 of the 21 pairs of
correlations for the experimental group to be less than those of the control
group, a difference significant at the .05 level. We can conclude from these
results, then, that the tailored instructions given to the experimental group
resulted in reduced inter-item correlations among the two sets of diagnostic
evaluation items.
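
The sign test described above can be sketched as follows. This is a minimal
illustration, not the authors' computation: the one-sided form, the dropping of
tied pairs, the function names, and the ten paired correlations are all
assumptions made for the example, and the values are not those of Table 2.

```python
from math import comb


def sign_test_p(successes, n):
    """One-sided exact sign test: P(X >= successes) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


def count_lower(experimental, control):
    """Count pairs in which the experimental r is below the control r.

    Tied pairs are dropped, as is conventional for the sign test.
    Returns (number of pairs with experimental < control, number of untied pairs).
    """
    pairs = [(e, c) for e, c in zip(experimental, control) if e != c]
    lower = sum(1 for e, c in pairs if e < c)
    return lower, len(pairs)


# Hypothetical paired correlations (not the values from Table 2).
exp_r = [.21, .35, .18, .40, .27, .30, .22, .45, .19, .33]
ctl_r = [.44, .41, .39, .38, .52, .47, .36, .43, .50, .48]

lower, n = count_lower(exp_r, ctl_r)
print(lower, n, round(sign_test_p(lower, n), 4))  # prints: 8 10 0.0547
```

Because the test uses only the direction of each paired difference, it makes no
assumption about the size or distribution of the differences between
correlations.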



Additional evidence exists to show that the students within the experimental
group continued to show more discrimination among items between the instructor
feedback section and course information section, as well as within these
evaluative sections. Total ratings were computed for the eleven instructor
feedback items and the seven course information items for both experimental and
control groups. The correlation between total instructor feedback and course
information sections across all students within the experimental group equalled
.53, whereas the same correlation for the control group was .71. These results
may be taken as further support for the contention that the experimental
instructions cause the student to look more discerningly at the specific items
rather than to be affected by some overriding attitude, or halo effect,
throughout the evaluation. A z-test for differences between these correlations
did show, however, no statistically significant difference between the groups.
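
One common way to compare two correlations from independent groups is Fisher's
r-to-z transformation. The sketch below assumes that this is the kind of test
involved and that all 49 and 47 students contributed to the respective
correlations; neither point is stated in the text, and the function name is
hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def compare_independent_r(r1, n1, r2, n2):
    """Fisher r-to-z test for the difference between two independent correlations.

    Returns the z statistic and the two-sided p-value.
    """
    z1, z2 = atanh(r1), atanh(r2)           # Fisher transformation of each r
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # standard error of the difference
    z = (z2 - z1) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Reported correlations and group sizes: .53 (N = 49) and .71 (N = 47).
z, p = compare_independent_r(0.53, 49, 0.71, 47)
print(round(z, 2), p > 0.05)  # prints: 1.41 True
```

Under these assumptions the difference falls short of significance at the .05
level, in line with the result reported above.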
It is interesting to ask if the experimental treatment altered the item means
in comparison to the control group. t-tests were performed between experimental
and control group mean responses on all diagnostic items within the instructor
feedback and student information sections. No obtained t value between group
mean responses reached significance at the .05 level. Furthermore, the
experimental group gave more favorable ratings on ten items, and less favorable
ratings on eight items. This difference is not significant. Thus, there is
no evidence that the experimental treatment altered the overall level at which
students responded.

Importance Rankings

Instructions to both the experimental and control group students included
having each student rank the items within each section in terms of importance.
The only difference between groups was that the experimental group ranked the
items prior to using them to rate the course and instructor; the control group
did their rankings subsequent to their ratings. The median rank of each item
for both groups is found in Table 3. Also found in Table 3 are the relative
ranks of the items in terms of these medians.

Insert Table 3 about here



In general, there was a high degree of agreement between experimental and
control group members in terms of the relative importance of items. The rank
correlation between the ranks for the instructor feedback items was .79; the
same correlation for the student information items was a perfect 1.00.
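
A rank correlation of this kind can be sketched with Spearman's formula. The
text does not name the coefficient used, so Spearman's rho is an assumption
here, as are the function name and the example rankings, which are not the
values from Table 3. The shortcut formula below applies only when neither
ranking contains ties.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two untied rankings of the same items."""
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical relative ranks for 7 items (not the values from Table 3).
experimental = [4, 3, 1, 2, 5, 7, 6]
control_same = [4, 3, 1, 2, 5, 7, 6]  # identical ordering
control_swap = [3, 4, 1, 2, 5, 7, 6]  # first two items exchanged

print(spearman_rho(experimental, control_same))            # prints: 1.0
print(round(spearman_rho(experimental, control_swap), 3))  # prints: 0.964
```

A value of 1.00 therefore means the two groups ordered the items identically,
while even a single adjacent swap among seven items lowers the coefficient
slightly.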

Within the instructor feedback section, the highest ranked items were
those dealing with the instructor's explanations and organization. Items
dealing with characteristics of the instructor, e.g., his/her enthusiasm,
interest, clarity of objectives, and availability of extra help were rated as
less important. Within the student information section, amount learned in the
course was rated most highly, followed by the relevance and usefulness of the
course content. Instructor interest in student learning and use of class time
were intermediately ranked. Grading, clarity of responsibilities, and assigned
work were rated as least important. From a pedagogical point of view, the
rankings by students seem very sound. However, it should be kept in mind that
these rankings were applied to a specific course, not courses in general.
Discussion

The primary purpose of this study was to explore whether correlations among
diagnostic items of a student ratings form could be reduced through using
special instructions to raters. These instructions differed from standard
instructions in two ways: students rank-ordered items in terms of importance
prior to using them to rate the course, and students responded to the diagnostic
items prior to responding to general evaluational items.

The special instructions were successful in reducing the inter-item
correlations relative to the same correlations deriving from the ratings of a
group using standard instructions. We theorized that this reduction could be
indicative of greater diagnostic value of the ratings of the experimental group.
This implication is clearly based on an indirect and statistical argument, but
reduced correlations among items within a class are not sufficient to claim
greater diagnosticity of those items. In the extreme case, inter-item
correlations can be reduced by including irrelevant and poorly-written items on
the form, a method which would clearly reduce diagnosticity. Further studies
should be conducted in which the methodology of this study is combined with
systematic manipulations of some specific teaching behaviors, e.g., poor vs.
good explanations, while holding others constant. Studies of this sort could
more directly confront the issue of relative item diagnosticity.



Further study is also necessary to assess the relative importance of the
two strategies used in this study to influence correlations among items.
Having students rank items before using them for rating the course was
confounded with having students respond to the diagnostic items prior to
responding to the general items. It is presently impossible to determine which
of these strategies is effective, or whether it is a combination of the two. At
least two additional groups should be assessed: one for which standard
instructions are only modified to include prior ranking of items and one for
which standard instructions are only modified to include responding to
diagnostic items first.

To conclude, the basic purpose of this study was achieved; that is,
non-standard instructions were developed which successfully reduced correlations
among items. We feel these lowered correlations may reflect an increase in the
information arising from these items specifically for the diagnosis of
instructional problems. Further research, more direct in nature, is needed to
validate our assumption.
Table 1

Items Within the Instructor Feedback and the Student
Information Sections (Form B) of IAS

INSTRUCTOR FEEDBACK ITEMS

5. Course organization was:
6. Sequential presentation of concepts was:
7. Explanations by instructor were:
8. Instructor's ability to present alternative explanations
   when needed was:
9. Instructor's use of examples and illustrations was:
10. Instructor's enhancement of student interest in
    the material was:
11. Student confidence in instructor's knowledge was:
12. Instructor's enthusiasm was:
13. Clarity of course objectives was:
14. Interest level of class sessions was:
15. Availability of extra help when needed was:

STUDENT INFORMATION ITEMS

16. Use of class time was:
17. Instructor's interest in whether students learned was:
18. Amount you learned in the course was:
19. Relevance and usefulness of course content was:
20. Evaluative and grading techniques (tests, papers,
    projects, etc.) were:
21. Reasonableness of assigned work was:
22. Clarity of student responsibilities and requirements was:
Table 2

Inter-Item Correlations of Instructor Feedback Items and
Student Information Items in Experimental and Control Groups*

*Italic numbers represent the experimental condition. Items 5-15 are the
instructor feedback items, while items 16-22 are the student information items.
Table 3

Median Importance Ranks and Relative Rank of
Items in Experimental and Control Conditions

Columns: Item; Median (Exper., Control)
Row groups: Instructor Feedback Items; Student Information Items
Appendix B: Regular Control Group Instructions

Hello, I'm __________ from the Educational Assessment Center
and I'm doing a study to learn more about how students rate their courses.
I'd like you to respond to this questionnaire. While this is not a regular
end-of-the-quarter rating, the results will be given to the instructor after
the course is over. If items refer to the instructor, rate your professor
and not your T.A. Please respond to every question. Does anyone need a pencil?

Beginning at the top of the questionnaire, where you are asked for
information about yourself, please respond to the entire questionnaire. I'll
wait. (Wait.) Now, let's go back to Section II. Rank order all of the 11
items from 1 to 11 judging what you believe are most important as feedback
items to the instructor's teaching effectiveness. Remember, 1 is most important
and 11 is least important. Place your ranks to the left of the printed item
number. Do not go back and change your responses. Do the same for the 7
items in Section III, ranking them from 1 to 7. Again, 1 is most important
and 7 is least important.

Are there any questions?

(When finished, thank the students.)
Appendix C: Tailored Instructions

Hello, I'm __________ from the Educational Assessment Center and I'm doing
a study to learn more about how students rate their courses. I'd like you to
respond to this questionnaire. While this is not a regular end-of-quarter
rating, the results will be given to the instructor after the course is over. If
items refer to the instructor, rate your professor and not your T.A. Please
respond to every question. I'm going to pass out questionnaires. Please leave
them face-down until I give you further instructions. Does anyone need a pencil?

I would like to begin with Section II. Read the items - there are 11 of
them. Rank order all of the items in Section II from 1 to 11 judging how
important they are as feedback items to the instructor's teaching effectiveness.
1 is most important and 11 is least important. Place your ranks to the left of
the printed item number. (Pause) Go back and grid in the items in the order
in which you ranked them...one first, and so on. Do the same for Section III,
ranking the 7 items from 1 to 7. Again, 1 is most important and 7 is least
important.

When you have completed Section III, go to the top of the questionnaire
where you are asked for information about yourself. Please respond. Then go to
Section I. Do not rank order these items. Simply respond to the choices.

Are there any questions?

(When finished, thank the students.)